377 research outputs found

CpG islands or CpG clusters: how to identify functional GC-rich regions in a genome?

Author: AP Bird
C Jiang
D Takai
H Kawaji
IP Ioshikhes
L Han
Leng Han
M Gardiner-Garden
M Hackenberg
M Weber
P Carninci
Zhongming Zhao
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: CpG islands (CGIs), clusters of CpG dinucleotides in GC-rich regions, are often located in the 5' end of genes and considered gene markers. Hackenberg et al. (2006) recently developed a new algorithm, CpGcluster, which uses a completely different mathematical approach from previous traditional algorithms. Their evaluation suggests that CpGcluster provides a much more efficient approach to detecting functional clusters or islands of CpGs. Results: We systematically compared CpGcluster with the traditional algorithm by Takai and Jones (2002). Our comparisons of (1) the number of islands versus the number of genes in a genome, (2) the distribution of islands in different genomic regions, (3) island length, (4) the distance between two neighboring islands, and (5) methylation status suggest that Takai and Jones' algorithm is overall more appropriate for identifying promoter-associated islands of CpGs in vertebrate genomes. Conclusion: The generation of genome sequence and DNA methylation data is expected to accelerate greatly. The information in this study is important for its extensive utility in gene feature analysis and epigenomics including gene prediction and methylation chip design in different genomes.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

VCU Scholars Compass

Fast and Space-Efficient Location of Heavy or Dense Segments in Run-Length Encoded Sequences

Author: A. Nekrutenko
F. Larsen
M. Gardiner-Garden
R.C. Hardison
S. Hannenhalli
X. Huang
Y. Lin Ling
Publication venue: Springer Science and Business Media LLC
Publication date: 01/07/2003

This paper considers several variations of an optimization problem with potential applications in such areas as biomolecular sequence analysis and image processing. Given a sequence of items, each with a weight and a length, the goal is to find a subsequence of consecutive items of optimal value, where value is either total weight or total weight divided by total length. There may also be a specified lower and/or upper bound on the acceptable length of subsequences. This paper shows that all the variations of the problem are solvable in linear time and space even with non-uniform item lengths and divisible items, implying that run-length encoded sequences can be handled in time and space linear in the number of runs. Furthermore, some problem variations can be solved in constant space. Also, these time and space bounds suffice for certain problem variations in which we call for reporting of many “good” subsequences
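The optimization problem in this abstract can be illustrated with a small brute-force sketch. It is not the linear-time algorithm the paper describes: it checks every run of consecutive whole items in quadratic time and only handles the density variant (total weight divided by total length) with a lower length bound. The names `Item`, `densest_segment`, and `min_length` are made up for this illustration, and item lengths are assumed to be strictly positive.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: each item is (weight, length), with length > 0.
Item = Tuple[float, float]


def densest_segment(
    items: List[Item], min_length: float = 0.0
) -> Optional[Tuple[int, int, float]]:
    """Return (start, end, density) for the run items[start:end] with the
    highest total weight / total length, among runs whose total length is
    at least min_length. Returns None if no run qualifies.

    Naive O(n^2) baseline over whole items only; it does not split items.
    """
    best: Optional[Tuple[int, int, float]] = None
    for start in range(len(items)):
        weight = 0.0
        length = 0.0
        for end in range(start, len(items)):
            w, l = items[end]
            weight += w
            length += l
            if length < min_length:
                continue  # run is still too short to be acceptable
            density = weight / length
            if best is None or density > best[2]:
                best = (start, end + 1, density)
    return best
```

A run-length encoded sequence maps onto this input directly: each run becomes one item whose length is the run length and whose weight is the run's total score. A baseline like this is mainly useful for checking a faster implementation on small inputs.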

Crossref

Loyola eCommons

WordCluster: detecting clusters of DNA words and genomic elements

Author: A Sandelin
A Siepel
AR Quinlan
B Giardine
D Durand
D Karolchik
Guillermo Barturen
José L Oliver
KD Pruitt
M Ashburner
M Gardiner-Garden
M Hackenberg
Michael Hackenberg
P Carpena
Pedro Bernaola-Galván
Pedro Carpena
R Aloni
R Lister
TJ Hubbard
VJ Makeev
Ángel M Alganza
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Many k-mers (or DNA words) and genomic elements are known to be spatially clustered in the genome. Well established examples are the genes, TFBSs, CpG dinucleotides, microRNA genes and ultra-conserved non-coding regions. Currently, no algorithm exists to find these clusters in a statistically comprehensible way. The detection of clustering often relies on densities and sliding-window approaches or arbitrarily chosen distance thresholds. Results: We introduce here an algorithm to detect clusters of DNA words (k-mers), or any other genomic element, based on the distance between consecutive copies and an assigned statistical significance. We implemented the method into a web server connected to a MySQL backend, which also determines the co-localization with gene annotations. We demonstrate the usefulness of this approach by detecting the clusters of CAG/CTG (cytosine contexts that can be methylated in undifferentiated cells), showing that the degree of methylation varies drastically between inside and outside of the clusters. As another example, we used WordCluster to search for statistically significant clusters of olfactory receptor (OR) genes in the human genome. Conclusions: WordCluster seems to predict biologically meaningful clusters of DNA words (k-mers) and genomic entities. The implementation of the method into a web server is available at http://bioinfo2.ugr.es/wordCluster/wordCluster.php including additional features like the detection of co-localization with gene regions or the annotation enrichment tool for functional analysis of overlapped genes.
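The notion of "distance between consecutive copies" can be made concrete with a short sketch. This is not the WordCluster method: the abstract derives its grouping from an assigned statistical significance, whereas the fixed `max_gap` threshold below is a simplifying assumption of exactly the kind the abstract criticizes, used only to show the mechanics. The function names are hypothetical.

```python
from typing import List, Tuple


def word_positions(sequence: str, word: str) -> List[int]:
    """Start positions of every (possibly overlapping) copy of word."""
    positions = []
    i = sequence.find(word)
    while i != -1:
        positions.append(i)
        i = sequence.find(word, i + 1)
    return positions


def cluster_by_gap(positions: List[int], max_gap: int) -> List[Tuple[int, int, int]]:
    """Group sorted positions into clusters in which consecutive copies are
    at most max_gap apart. Returns (first_pos, last_pos, copy_count) and
    keeps only clusters with at least two copies.

    max_gap is an illustrative fixed threshold, not a significance test.
    """
    clusters = []
    run: List[int] = []
    for p in positions:
        if run and p - run[-1] > max_gap:
            if len(run) >= 2:
                clusters.append((run[0], run[-1], len(run)))
            run = []  # gap too large: start a new candidate cluster
        run.append(p)
    if len(run) >= 2:
        clusters.append((run[0], run[-1], len(run)))
    return clusters
```

In a statistically grounded variant, the threshold would be derived from the expected distribution of distances between copies instead of being passed in by hand.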

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Repositorio Institucional Universidad de Granada

Paternity test in "Mangalarga-Marchador" equines by DNA-fingerprinting

Author: BIRD A.P.
BROAD T.E.
BRUFORD M.W.
CARLOS EDUARDO ANUNCIAÇÃO
ELLEGREN H.
EPPLEN J.T.
GARDINER-GARDEN M.
GEORGES M.
GILBERT D.A.
HABERFELD A.
JEFFREYS A.J.
MADSEN L.
PENA S.D.J.
RUBERTIS F.
SAKAGAMI M.
SAMBROOK J.
SCHWAIGER F.
SPARTACO ASTOLFI-FILHO
WIJERS E.R.
Publication venue: FapUNIFESP (SciELO)
Publication date

Crossref

Knowledge sharing and collaboration in translational research, and the DC-THERA Directory

Author: A. Splendiani
Ashburner
Ball
Bard
Barrett
Bourne
Brazma
Burks
C. Scognamiglio
Ciccarese
D. Cavalieri
Demir
Etzold
Gardiner-Garden
Gardner
Goble
Grenon
Hoehndorf
Howe
Hu
Hucka
Hull
J. M. Austyn
Kinoshita
M. Brandizi
M. Gundel
Masci
Mons
Rebholz-Schuhmann
Roberts
Rubin
Shotton
Silvertown
Smith
Stein
Szalay
Webster
Wilkinson
Publication venue: Oxford University Press
Publication date: 01/01/2011

Biomedical research relies increasingly on large collections of data sets and knowledge whose generation, representation and analysis often require large collaborative and interdisciplinary efforts. This dimension of ‘big data’ research calls for the development of computational tools to manage such a vast amount of data, as well as tools that can improve communication and access to information from collaborating researchers and from the wider community. Whenever research projects have a defined temporal scope, an additional issue of data management arises, namely how the knowledge generated within the project can be made available beyond its boundaries and life-time. DC-THERA is a European ‘Network of Excellence’ (NoE) that spawned a very large collaborative and interdisciplinary research community, focusing on the development of novel immunotherapies derived from fundamental research in dendritic cell immunobiology. In this article we introduce the DC-THERA Directory, which is an information system designed to support knowledge management for this research community and beyond. We present how the use of metadata and Semantic Web technologies can effectively help to organize the knowledge generated by modern collaborative research, how these technologies can enable effective data management solutions during and beyond the project lifecycle, and how resources such as the DC-THERA Directory fit into the larger context of e-science

Crossref

PubMed Central

Oxford University Research Archive

Rothamsted Repository

Inheritance of an Epigenetic Mark: The CpG DNA Methyltransferase 1 Is Required for De Novo Establishment of a Complex Pattern of Non-CpG Methylation

Author: A Bird
A Hermann
A Olek
AP Bird
BH Ramsahoye
C Gicquel
C Mund
C Pittoggi
CM Suter
E Li
François Cuzin
GP White
JE Dodge
KD Robertson
M Gardiner-Garden
M Okano
M Rassoulzadegan
MG Goll
Minoo Rassoulzadegan
P Meyer
Ruken Yaman
S Kouidou
Sebastian Fugmann
SR Cherry
T Imamura
T Pelissier
Valérie Grandjean
Publication venue: Public Library of Science
Publication date: 07/11/2007

Site-specific methylation of cytosines is a key epigenetic mark of vertebrate DNA. While a majority of the methylated residues are in the symmetrical (meC)pG:Gp(meC) configuration, a smaller, but significant fraction is found in the CpA, CpT and CpC asymmetric (non-CpG) dinucleotides. CpG methylation is reproducibly maintained by the activity of the DNA methyltransferase 1 (Dnmt1) on the newly replicated hemimethylated substrates (meC)pG:GpC. On the other hand, establishment and hereditary maintenance of non-CpG methylation patterns have not been analyzed in detail. We previously reported the occurrence of site- and allele-specific methylation at both CpG and non-CpG sites. Here we characterize a hereditary complex of non-CpG methylation, with the transgenerational maintenance of three distinct profiles in a constant ratio, associated with extensive CpG methylation. These observations raised the question of the signal leading to the maintenance of the pattern of asymmetric methylation. The complete non-CpG pattern was reinstated at each generation in spite of the fact that the majority of the sperm genomes contained either none or only one methylated non-CpG site. This observation led us to the hypothesis that the stable CpG patterns might act as blueprints for the maintenance of non-CpG DNA methylation. As predicted, non-CpG DNA methylation profiles were abrogated in a mutant lacking Dnmt1, the enzyme responsible for CpG methylation, but not in mutants defective for either Dnmt3a or Dnmt2.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Features of mammalian microRNA promoters emerge from polymerase II chromatin immunoprecipitation data

Author: A Bird
A Marson
A Rodriguez
A Sandelin
AP Bird
Arindam Bhattacharjee
Ben Gordon
CD Schmid
Christopher K. Patil
D Karolchik
David L. Corcoran
DL Corcoran
DP Bartel
DS Prestridge
E Wingender
F Ozsolak
GD Stormo
GG Loots
GM Borchert
H Wakaguri
HJ Bussemaker
HK Saini
I Rigoutsos
IP Ioshikhes
J Taylor
J van Helden
K Woods
KD Taganov
Kusum V. Pandit
M Gardiner-Garden
M Megraw
MJ Buck
MP Brown
N Liu
Naftali Kaminski
NJ Martinez
O Chapelle
P Carninci
P Jin
Panayiotis V. Benos
R Gangal
R Shalgi
RM Kuhn
S Baskerville
S Fujita
S Mahony
SJ Cooper
T Abeel
T Thum
T Wang
TA Down
U Ohler
WJ Kent
X Zhao
X Zhou
Y Lee
Publication venue: Public Library of Science (PLoS)
Publication date: 01/04/2009

Background: MicroRNAs (miRNAs) are short, non-coding RNA regulators of protein coding genes. miRNAs play a very important role in diverse biological processes and various diseases. Many algorithms are able to predict miRNA genes and their targets, but their transcription regulation is still under investigation. It is generally believed that intragenic miRNAs (located in introns or exons of protein coding genes) are co-transcribed with their host genes and that most intergenic miRNAs are transcribed from their own RNA polymerase II (Pol II) promoter. However, the length of the primary transcripts and the promoter organization are currently unknown. Methodology: We performed Pol II chromatin immunoprecipitation (ChIP)-chip using a custom array surrounding regions of known miRNA genes. To identify the true core transcription start sites of the miRNA genes we developed a new tool (CPPP). We showed that miRNA genes can be transcribed from promoters located several kilobases away and that their promoters share the same general features as those of protein coding genes. Finally, we found evidence that as many as 26% of the intragenic miRNAs may be transcribed from their own unique promoters. Conclusion: miRNA promoters have similar features to those of protein coding genes, but miRNA transcript organization is more complex. © 2009 Corcoran et al.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

D-Scholarship@Pitt

Particle Swarm Optimization with Reinforcement Learning for the Prediction of CpG Islands in the Human Genome

Author: AG Barto
C Jiang
CD Davis
Cheng-Hong Yang
D Takai
F Fang
H Lai
Hsiu-Chen Huang
HW Ressom
J Hancock
J Kennedy
JP Egan
L Han
L Ponger
Li-Yeh Chuang
LY Chuang
M Gardiner-Garden
M Hackenberg
M Tykocinski
MF Kane
Ming-Cheng Lin
P Rice
R Illingworth
R Lister
R Poli
RK Hanson
S Whitehead
S Yegnasubramanian
VG Gudise
Vladimir Brusic
Y Lin
Y Sujuan
Publication venue: Public Library of Science
Publication date: 28/06/2011

BACKGROUND: Regions with abundant GC nucleotides, a high CpG number, and a length greater than 200 bp in a genome are often referred to as CpG islands. These islands are usually located in the 5' end of genes. Recently, several algorithms for the prediction of CpG islands have been proposed. METHODOLOGY/PRINCIPAL FINDINGS: We propose here a new method called CPSORL to predict CpG islands, which consists of a complement particle swarm optimization algorithm combined with reinforcement learning to predict CpG islands more reliably. Several CpG island prediction tools equipped with the sliding window technique have been developed previously. However, the quality of the results seems to rely too much on the choices that are made for the window sizes, and thus these methods leave room for improvement. CONCLUSIONS/SIGNIFICANCE: Experimental results indicate that CPSORL provides results of a higher sensitivity and a higher correlation coefficient in all selected experimental contigs than the other methods it was compared to (CpGIS, CpGcluster, CpGProd and CpGPlot). A higher number of CpG islands were identified in chromosomes 21 and 22 of the human genome than with the other methods from the literature. CPSORL also achieved the highest coverage rate (3.4%). CPSORL is an application for identifying promoter and TSS regions associated with CpG islands in the entire human genome. When compared to CpGcluster, the islands predicted by CPSORL covered a larger part of the TSS (12.2%) and promoter (26.1%) regions. If Alu sequences are considered, the islands predicted by CPSORL (Alu) covered a larger part of the TSS (40.5%) and promoter (67.8%) regions than CpGIS. Furthermore, CPSORL was used to verify that the average methylation density was 5.33% for CpG islands in the entire human genome.
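The three properties named in the background (GC-rich, many CpGs, longer than 200 bp) are commonly checked with two summary statistics per candidate region, sketched below. This is not CPSORL or any of the tools compared in the abstract. The abstract only gives the 200 bp length bound; the GC fraction threshold of 0.5 and the observed/expected CpG threshold of 0.6 are assumptions taken from the widely cited Gardiner-Garden and Frommer criteria, and the function names are hypothetical.

```python
from typing import Tuple


def cpg_stats(seq: str) -> Tuple[float, float]:
    """Return (gc_fraction, observed/expected CpG ratio) for a sequence.

    Expected CpG count is (#C * #G) / length, so the ratio equals
    (#CpG * length) / (#C * #G).
    """
    s = seq.upper()
    n = len(s)
    if n == 0:
        return 0.0, 0.0
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n
    expected = c * g / n
    obs_exp = cpg / expected if expected > 0 else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(
    seq: str,
    min_length: int = 200,      # length bound stated in the abstract
    min_gc: float = 0.5,        # assumed threshold, not from the abstract
    min_obs_exp: float = 0.6,   # assumed threshold, not from the abstract
) -> bool:
    """Apply the length, GC-content, and CpG-ratio criteria to one region."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) > min_length and gc_fraction > min_gc and obs_exp > min_obs_exp
```

Sliding-window tools evaluate statistics like these on successive windows of a chromosome, which is where the dependence on window size noted in the abstract comes from.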

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Comparative analysis of sequence characteristics of imprinted genes in human, mouse, and cattle

Author: A Lewis
AJ Wood
B Hutter
B Neumann
D Monk
E Allen
Eui-Soo Kim
H Khatib
H Kobayashi
H Shirohzu
Hasan Khatib
I Zaitoun
IM Morison
Ismail Zaitoun
J Kim
J Walter
JK Killian
JM Greally
JR Weidman
K Higashimoto
K Okamura
M Gardiner-Garden
M Paulsen
MF Lyon
NT Ruddock
S Engemann
S Zhang
SV Dindot
Y Mizuno
Publication venue: Springer-Verlag
Publication date: 01/01/2007

Genomic imprinting is an epigenetic mechanism that results in monoallelic expression of genes depending on parent-of-origin of the allele. Although the conservation of genomic imprinting among mammalian species has been widely reported for many genes, there is accumulating evidence that some genes escape this conservation. Most known imprinted genes have been identified in the mouse and human, with few imprinted genes reported in cattle. Comparative analysis of genomic imprinting across mammalian species would provide a powerful tool for elucidating the mechanisms regulating the unique expression of imprinted genes. In this study we analyzed the imprinting of 22 genes in human, mouse, and cattle and found that in only 11 was imprinting conserved across the three species. In addition, we analyzed the occurrence of the sequence elements CpG islands, C + G content, tandem repeats, and retrotransposable elements in imprinted and in nonimprinted (control) cattle genes. We found that imprinted genes have a higher G + C content and more CpG islands and tandem repeats. Short interspersed nuclear elements (SINEs) were notably fewer in number in imprinted cattle genes compared to control genes, which is in agreement with previous reports for human and mouse imprinted regions. Long interspersed nuclear elements (LINEs) and long terminal repeats (LTRs) were found to be significantly underrepresented in imprinted genes compared to control genes, contrary to reports on human and mouse. Of considerable significance was the finding of highly conserved tandem repeats in nine of the genes imprinted in all three species

Crossref

Springer - Publisher Connector

PubMed Central

A Meta-Analysis of Microarray Gene Expression in Mouse Stem Cells: Redefining Stemness

Author: A Sandelin
A Smith
AI Su
BE Bernstein
D Baek
David T. Jones
H Parkinson
JL Attema
JR Landry
K Kimura
K Tsuritani
Kevin Bryson
LO Barrera
M Ashburner
M Buszczak
M Gardiner-Garden
M Grskovic
M Ramalho-Santos
NB Ivanova
NO Fortunel
P Carninci
P Flicek
P Rice
PA Jones
RA Irizarry
RC Gentleman
RH Waterston
S Falcon
S Fukada
S Prabhakar
T Barrett
TA Venezia
TB Miranda
TJ Hubbard
Winston Hide
Yvonne J. K. Edwards
Publication venue: Public Library of Science
Publication date: 01/07/2008

While much progress has been made in understanding stem cell (SC) function, a complete description of the molecular mechanisms regulating SCs is not yet established. This lack of knowledge is a major barrier holding back the discovery of therapeutic uses of SCs. We investigated the value of a novel meta-analysis of microarray gene expression in mouse SCs to aid the elucidation of regulatory mechanisms common to SCs and particular SC types. We added value to previously published microarray gene expression data by characterizing the promoter type likely to regulate transcription. Promoters of up-regulated genes in SCs were characterized in terms of alternative promoter (AP) usage and CpG-richness, with the aim of correlating features known to affect transcriptional control with SC function. We found that SCs have a higher proportion of up-regulated genes using CpG-rich promoters compared with the negative controls. Comparing subsets of SC type with the controls, a slightly different story unfolds: the differences between the proliferating adult SCs and the embryonic SCs versus the negative controls are statistically significant, whilst the difference between the quiescent adult SCs and the negative controls is not. On examination of AP usage, no difference was observed between SCs and the controls. However, comparing the subsets of SC type with the controls, the quiescent adult SCs are found to up-regulate a larger proportion of genes that have APs compared to the controls and the converse is true for the proliferating adult SCs and the embryonic SCs. These findings suggest that looking at features associated with control of transcription is a promising future approach for characterizing “stemness” and that further investigations of stemness could benefit from separate considerations of different SC states. For example, “proliferating-stemness” is shown here, in terms of promoter usage, to be distinct from “quiescent-stemness”.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

UCL Discovery

Enlighten